Abstract

In this paper we derive polynomial time algorithms that generate random
k-noncrossing matchings and k-noncrossing RNA structures with uniform
probability. Our approach employs the bijection between k-noncrossing
matchings and oscillating tableaux and the P-recursiveness of the
cardinalities of k-noncrossing matchings. The main idea is to consider the
tableaux sequences as paths of stochastic processes over shapes and to
derive their transition probabilities.

1. Introduction

In this paper we generate random k-noncrossing partial matchings and
k-noncrossing RNA structures with uniform probability in polynomial time.
Three decades ago Waterman pioneered the concept of RNA secondary
structures [31, 30]. These coarse grained RNA structures are subject to
strict combinatorial constraints: there exist no two arcs that cross in
their diagram representation. It is well-known, however, that there exist
cross-serial interactions, i.e. crossing base pairs [24]. These
configurations are called pseudoknots [32] and occur in functional RNA
(RNAseP [2]) and ribosomal RNA [19], and are conserved in the catalytic
core of group I introns. Pseudoknots appear in plant viral RNAs, and in
vitro RNA evolution [2^ experiments have produced families of RNA
structures with pseudoknot motifs when binding HIV-1 reverse
transcriptase. k-noncrossing RNA structures [16] allow one to express
pseudoknots and generalize the concept of RNA secondary structures in a
natural way. Due to their cross-serial interactions pseudoknot structures
cannot be recursively generated. This renders the ab initio folding into
minimum free energy configurations 1^ as well as the derivation of
detailed statistical properties a difficult task.

A partial matching over $[n] = \{1, \ldots, n\}$ is a labeled graph on
$[n]$, having vertices of degree at most one, represented by drawing its
vertices in increasing order on a horizontal line and its arcs $(i, j)$ in
the upper halfplane. Without loss of generality we shall assume $i < j$.
Two arcs $(i_1, j_1)$ and $(i_2, j_2)$ are crossing if
$i_1 < i_2 < j_1 < j_2$ holds and nesting if $i_1 < i_2 < j_2 < j_1$. A
k-crossing is a sequence of arcs $(i_1, j_1), \ldots, (i_k, j_k)$ such that
$i_1 < i_2 < \cdots < i_k < j_1 < j_2 < \cdots < j_k$. There is an
analogous notion of a k-nesting. A partial matching is called
k-noncrossing (k-nonnesting) if it does not contain any k-crossing
(k-nesting). A partial matching without isolated vertices is called a
matching. The numbers of k-noncrossing partial matchings and k-noncrossing
matchings over $[n]$ are denoted by $f_k^*(n)$ and $f_k(n)$, respectively.
A k-noncrossing RNA structure [lil, 16] is a k-noncrossing partial matching
without arcs of the form $(i, i+1)$, $1 \le i \le n-1$.

Figure 1: k-noncrossing diagrams: a 4-noncrossing (left) and a
3-noncrossing diagram (right).
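The definitions above can be checked mechanically. The following is a minimal brute-force sketch, not part of the paper's algorithms; the function names are hypothetical and arcs are assumed to be given as pairs `(i, j)` with `i < j` on distinct vertices.

```python
from itertools import combinations

def is_k_crossing(arcs):
    """True if the given arcs satisfy i1 < ... < ik < j1 < ... < jk."""
    arcs = sorted(arcs)                      # sort by start point
    ends = [j for _, j in arcs]
    last_start = arcs[-1][0]
    return last_start < ends[0] and all(a < b for a, b in zip(ends, ends[1:]))

def is_k_noncrossing(arcs, k):
    """Brute force: no k arcs of the partial matching form a k-crossing."""
    return not any(is_k_crossing(c) for c in combinations(arcs, k))

def is_rna_structure(arcs, k):
    """k-noncrossing and free of 1-arcs (i, i+1)."""
    return is_k_noncrossing(arcs, k) and all(j - i > 1 for i, j in arcs)

# Hypothetical example on [11]; vertex 7 is isolated.
arcs = [(2, 5), (1, 6), (4, 9), (3, 10), (8, 11)]
print(is_k_noncrossing(arcs, 2), is_k_noncrossing(arcs, 3))
```

The check enumerates all k-subsets of arcs, so it is only meant for small examples.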

For k-noncrossing partial matchings there exists no relation expressing
k-noncrossing partial matchings over n vertices via those over j < n
vertices as, for instance, the path-concatenation formula expressing
Motzkin-paths, see Figure 2, of length n via shorter paths depending on
whether the initial step is an up-step or a horizontal-step. Flajolet et
al. Q have shown how to uniformly generate via Boltzmann generators
elements of a combinatorial class, for which such a recurrence exists.
However, there is no comparable framework for the uniform generation of
elements of a non-inductive combinatorial class, a question arguably of
pure mathematical interest. The subject of uniform generation via
Markov-processes has been studied extensively. A computational study on
the uniform generation of random graphs via Markov-chains has been given
by [28]. Work on the uniform generation of specific graphs in the context
of parallel random access machine (PRAM) can be found in (3J] and Jerrum
et al. [13, 14, 25] studied approximation algorithms in the context of
rapidly mixing Markov-chains [H .

The motivation of this paper comes from the above mentioned RNA
pseudoknot structures [2^, modeled as k-noncrossing partial matchings
without 1-arcs [15, 16], see Figure 3. At present time only a few
statistical results, that is, central limit and discrete limit theorems,
derived via singularity analysis of the corresponding bivariate generating
functions are known [17, 11, 10, 18]. The results of this paper do not
only facilitate the derivation of detailed statistics of RNA pseudoknot
structures but also open the door for studies along the lines of and
novel, randomized folding routines. Our algorithms are freely available at
http://www.combinatorics.cn/cbpc/unif.html.
Our approach is as follows: we consider the bijection between
k-noncrossing partial matchings and specific sequences of Ferrers diagrams
and then use the D-finiteness [26] of the ordinary generating function. In
some sense, D-finiteness is the next best thing if constructive
recurrences are not available. D-finiteness implies P-recursiveness [26],
i.e. the existence of a finite recurrence relation for the cardinalities
of the combinatorial class with polynomial coefficients. Therefore the key
quantities, i.e. the transition probabilities of the specific stochastic
processes, can be derived with linear time complexity.

Figure 3: The Hepatitis Delta Virus (HDV)-pseudoknot structure represented
as a planar graph and as a diagram: we display the 3-noncrossing structure
as folded by the ab initio folding algorithm cross [13] (left) and the
diagram representation (right).



2. The bijection 

In this section we recall the main ideas on the bijection between
k-noncrossing partial matchings and *-tableaux, a specific class of
vacillating tableaux j^. The bijection facilitates the interpretation of
k-noncrossing partial matchings and k-noncrossing structures as paths of
the stochastic processes.

A Ferrers diagram, or shape, $\lambda$, is a collection of squares
arranged in left-justified rows with weakly decreasing number of boxes in
each row. A standard Young tableau (SYT), denoted by $T$, is a filling of
the squares by numbers which is strictly decreasing in each row and in
each column. We refer to standard Young tableaux simply as tableaux.
Elements can be inserted into SYT via the RSK-algorithm [27]. A *-tableaux
of shape $\lambda$ and of length $n$ is a sequence of shapes
$(\lambda^i)_{i=0}^{n}$, where $\lambda^0 = \varnothing$,
$\lambda^n = \lambda$, such that for all $1 \le i \le n$, the shape
$\lambda^i$ is obtained from $\lambda^{i-1}$ by either adding one square,
removing one square, or doing nothing. A *-tableaux in which any two
subsequent shapes $\lambda^{i-1}, \lambda^i$ are different is called an
oscillating tableaux.

Our first observation ^ puts RSK-insertion into context with *-tableaux.
It is the key for proving Theorem 1, which establishes the bijection
between *-tableaux of empty shape and length $n$, having at most $(k-1)$
rows, and k-noncrossing partial matchings on $[n]$. It may be viewed as a
"reverse" RSK, facilitating the construction of *-tableaux via partial
matchings.

Lemma 1. f^, Q/ Suppose we are given two shapes
$\lambda^i \subsetneq \lambda^{i-1}$, which differ by exactly one square.
Let $T_{i-1}$ and $T_i$ be SYT of shape $\lambda^{i-1}$ and $\lambda^i$,
respectively. Then there exists a unique $j$ contained in $T_{i-1}$ and a
unique tableau $T_i$ such that $T_{i-1}$ is obtained from $T_i$ by
inserting $j$ via the RSK-algorithm.

We shall proceed by illustrating how the bijection 0] works. Given a
*-tableaux of empty shape,
$(\varnothing, \lambda^1, \ldots, \lambda^{n-1}, \varnothing)$, reading
$\lambda^i \setminus \lambda^{i-1}$ from left to right, at step $i$, we do
the following:

• for a $+\square$-step we insert $i$ into the new square

• for a $0$-step we do nothing

• for a $-\square$-step we extract the unique entry, $j(i)$, of the
tableaux $T^{i-1}$, which via RSK-insertion into $T^i$ recovers it
(Lemma 1). The latter extractions generate the arc-set
$\{(i, j(i)) \mid i \text{ is a } -\square\text{-step}\}$ of a
k-noncrossing partial matching, see Fig. 4.

[Figure 4 diagram: vertices 1, ..., 11 with extracted arcs (2,5), (1,6),
(4,9), (3,10), (8,11).]

Figure 4: From *-tableaux to partial matchings, see Lemma 1. If
$\lambda^i \setminus \lambda^{i-1} = -\square$, then the unique number is
extracted, which, if RSK-inserted into $\lambda^i$, recovers
$\lambda^{i-1}$. This yields the arc-set of a k-noncrossing partial
matching.

Given a k-noncrossing partial matching, we next construct a unique
*-tableaux as follows: starting with the empty shape, consider the
sequence $(n, n-1, \ldots, 1)$ and do the following:

• if $j$ is the endpoint of an arc $(i, j)$, then RSK-insert $i$

• if $j$ is the startpoint of an arc $(j, s)$, then remove the square
containing $j$

• if $j$ is an isolated point, then do nothing, see Fig. 5.

Figure 5: From partial matchings to *-tableaux via RSK insertion of the
origins of arcs.

The above construction leads to 

Theorem 1. f^J Each *-tableaux of length $n$, containing shapes with at
most $(k-1)$ rows, corresponds uniquely to a k-noncrossing partial
matching on $[n]$.

Of course, the above bijection induces a correspondence between
oscillating tableaux and k-noncrossing matchings.
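The right-to-left construction of Section 2 can be sketched in a few lines. This is an illustrative sketch only: it assumes the textbook row-insertion convention in which entries increase along rows and columns (a choice of the sketch, not necessarily the convention stated in the text), and the names `rsk_insert` and `matching_to_shapes` are hypothetical.

```python
from bisect import bisect_right

def rsk_insert(tableau, x):
    """Row-insert x into a tableau (list of increasing rows), in place."""
    for row in tableau:
        pos = bisect_right(row, x)
        if pos == len(row):          # x is larger than the whole row
            row.append(x)
            return
        row[pos], x = x, row[pos]    # bump the first larger entry downwards
    tableau.append([x])

def matching_to_shapes(n, arcs):
    """Shapes (lambda^0, ..., lambda^n) of the *-tableaux of a partial matching."""
    origin_of = {j: i for i, j in arcs}      # endpoint -> startpoint
    starts = {i for i, _ in arcs}
    tableau, shapes = [], [()]
    for j in range(n, 0, -1):
        if j in origin_of:                   # endpoint: insert the origin
            rsk_insert(tableau, origin_of[j])
        elif j in starts:                    # startpoint: j is the largest
            for row in tableau:              # entry, so it ends some row
                if row[-1] == j:
                    row.pop()
                    break
            tableau[:] = [r for r in tableau if r]
        shapes.append(tuple(len(r) for r in tableau))
    shapes.reverse()                         # processing ran from n down to 1
    return shapes

# Hypothetical example: a 3-noncrossing partial matching on [11].
shapes = matching_to_shapes(11, [(2, 5), (1, 6), (4, 9), (3, 10), (8, 11)])
print(shapes)
```

In this example no shape has more than two rows, in line with Theorem 1 for k = 3; a pair of crossing arcs produces a two-row shape, whereas a pair of nested arcs stays within one row.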

3. D-finiteness 

Suppose $(\lambda^i)_{i=0}^{n}$ is a *-tableaux of shape $\lambda$ having
at most $(k-1)$ rows. Let $O_k^*(\lambda^i, n-i)$ and
$O_k^0(\lambda^i, n-i)$ denote the numbers of *-tableaux and oscillating
tableaux of shape $\lambda^i$ and length $(n-i)$, respectively. In this
section we establish that these quantities can be computed with $O(n)$
time and space complexity. In addition, in the case of $k = 3$, we derive
explicit formulas.
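For small parameters the two quantities can also be obtained by direct dynamic programming over shapes, which is useful as a reference when testing faster methods. The sketch below is such a brute-force reference and not the linear-time method of this section; the names `steps` and `count_tableaux` are hypothetical, and shapes are encoded as tuples of row lengths.

```python
from functools import lru_cache

def steps(shape, k, with_zero):
    """Shapes with at most k-1 rows reachable from `shape` in one step."""
    out = [shape] if with_zero else []
    rows = list(shape) + [0] * (k - 1 - len(shape))
    for r in range(k - 1):
        for d in (1, -1):
            new = rows.copy()
            new[r] += d
            # keep only valid shapes: non-negative, weakly decreasing rows
            if new[r] >= 0 and all(a >= b for a, b in zip(new, new[1:])):
                out.append(tuple(x for x in new if x > 0))
    return out

def count_tableaux(shape, m, k, with_zero=True):
    """Number of step sequences of length m between `shape` and the empty shape.

    with_zero=True counts *-tableaux, with_zero=False oscillating tableaux.
    """
    @lru_cache(maxsize=None)
    def go(s, left):
        if sum(s) > left:            # too few steps left to remove all squares
            return 0
        if left == 0:
            return 1                 # s is necessarily empty here
        return sum(go(t, left - 1) for t in steps(s, k, with_zero))
    return go(tuple(shape), m)

print(count_tableaux((), 12, 3))     # 3-noncrossing partial matchings on [12]
```

For the empty shape, k = 3 and length 12 this reproduces the number 99991 quoted in Figure 6.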

3.1. The exponential generating function 

Given a *-tableaux of shape $\lambda$, $(\lambda^i)_{i=0}^{n}$, where
$\lambda^n = \lambda$, we consider the number of squares in the $s$th row
of shape $\lambda^i$, denoted by $x_s(i)$. It is evident that a *-tableaux
of shape $\lambda$ with at most $(k-1)$ rows uniquely corresponds to a
walk of length $n$ which starts at $a = (k-1, k-2, \ldots, 1)$ and ends at
$b = (k-1+x_1(n), \ldots, 1+x_{k-1}(n))$ having steps $0, \pm e_i$,
$1 \le i \le k-1$, such that $0 < x_{k-1} < \cdots < x_1$ at any step.
That is, a *-tableaux of shape $\lambda$ with at most $(k-1)$ rows
corresponds to a lattice path in $\mathbb{Z}^{k-1}$ that remains in the
interior of the dominant Weyl chamber j^. For
$a, b \in \mathbb{Z}^{k-1}$, let $\gamma_n^*(a,b)$ denote a walk of length
$n$ which starts at $a = (k-1, k-2, \ldots, 1)$, ends at $b$ and that has
steps $0, \pm e_i$, $1 \le i \le k-1$, such that
$0 < x_{k-1} < \cdots < x_1$ at any step. Let $\Gamma_n^*(a,b)$ be the
number of such walks. For our purposes, it suffices to consider walks
without 0-steps. To this end, let $\gamma_n^0(a,b)$ denote a
$\gamma_n^*(a,b)$-walk that does not contain any zero-steps and let
$\Gamma_n^0(a,b)$ denote their number. In case of
$a = b = (k-1, \ldots, 1)$, Theorem 1 implies

$$\Gamma_n^*(a,b) = O_k^*(\lambda, n) \quad\text{and}\quad \Gamma_n^0(a,b) = O_k^0(\lambda, n), \qquad (3.1)$$

where $\lambda$ represents the unique shape with at most $(k-1)$ rows that
corresponds to the lattice point $b \in \mathbb{Z}^{k-1}$. Let
$I_r(2x) = \sum_{j \ge 0} \frac{x^{2j+r}}{j!\,(r+j)!}$ be the hyperbolic
Bessel function of the first kind of order $r$. The following lemma is
implied by the reflection principle [22] and due to Grabiner and Magyar
jol. It expresses the exponential generating functions of
$\Gamma_n^*(a,b)$ and $\Gamma_n^0(a,b)$ via a determinant of Bessel
functions:

Lemma 2. f^] The exponential generating function for the numbers of
k-noncrossing matchings, $\Gamma_n^0(a,b)$, is given by

$$\sum_{n \ge 0} \Gamma_n^0(a,b)\, \frac{x^n}{n!} = \det\left[ I_{b_j - a_i}(2x) - I_{a_i + b_j}(2x) \right]_{i,j=1}^{k-1}. \qquad (3.2)$$


Consequently we have an algebraic relation for the exponential generating
function. Since D-finite functions form an algebra [26], the above
relation implies that the ordinary generating function
$\sum_{n \ge 0} \Gamma_n^0(a,b)\, z^n$ is also D-finite. That is, the
ordinary generating function of oscillating tableaux with at most $(k-1)$
rows of arbitrary shape $\lambda$,
$\sum_{n \ge 0} O_k^0(\lambda, n)\, x^n$, is D-finite [26]. Since
D-finiteness is equivalent to P-recursiveness [27], we have

Corollary 1. For fixed shape $\lambda$ with at most $(k-1)$ rows and
$n \in \mathbb{N}$, there exists some $m \in \mathbb{N}$ and polynomials
$p_0(n), \ldots, p_m(n)$ such that

$$p_m(n)\, O_k^0(\lambda, n+m) + \cdots + p_0(n)\, O_k^0(\lambda, n) = 0. \qquad (3.3)$$

In particular, the numbers $O_k^0(\lambda, n)$ can be computed in $O(n)$
time.

The key point here is that for given $n$ and $\lambda$, the derivation of
eq. (3.3) is a preprocessing step. It has to be derived only once, for
instance employing Zeilberger's algorithm [33, 23].
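To illustrate what a P-recursion buys computationally, consider the simplest case k = 2 with empty shape, where partial matchings without crossings are counted by the Motzkin numbers. The sketch below uses the classical three-term Motzkin recurrence, which is a well-known identity and not a recurrence derived in this paper; the function name is hypothetical.

```python
def motzkin(n_max):
    """M_0, ..., M_{n_max} via (n + 2) M_n = (2n + 1) M_{n-1} + 3 (n - 1) M_{n-2}."""
    m = [1, 1]
    for n in range(2, n_max + 1):
        # polynomial coefficients in n: one O(1) arithmetic step per new term
        m.append(((2 * n + 1) * m[n - 1] + 3 * (n - 1) * m[n - 2]) // (n + 2))
    return m[: n_max + 1]

print(motzkin(8))
```

Each new term costs a constant number of arithmetic operations, which is the sense in which a recurrence with polynomial coefficients yields the first n values in O(n) steps.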



3.2. The case of 3-noncrossing partial matchings 

Let $(\pi_1, \pi_2)$ be a *-pair with the endpoints $v_u = (n, h_u)$,
$u = 1, 2$, and let $t(n, h_1, h_2)$ denote the number of such pairs. A
*-pair can readily be identified with a pair of non-intersecting paths,
$(\pi_1', \pi_2')$, as follows:

$$\pi_1' = ((i, \pi_1(i) + 2))_i, \qquad \pi_2' = \pi_2.$$

Since $(\pi_1', \pi_2')$ are non-intersecting paths, Lindström's theorem
[23,0] allows us to compute $t(n, h_1, h_2)$. Since $\pi_1'(0) = 2$,
$\pi_2'(0) = 0$, $\pi_1'(n) = h_1 + 2$ and $\pi_2'(n) = h_2$,

$$t(n, h_1, h_2) = P_{(0,2)}^{(n,h_1+2)}\, P_{(0,0)}^{(n,h_2)} - P_{(0,2)}^{(n,h_2)}\, P_{(0,0)}^{(n,h_1+2)}.$$

Inserting two up-steps, we observe that $\pi_1'$ uniquely corresponds to a
path starting at $(-2, 0)$ and ending at $(n, h_1 + 2)$ of length $n + 2$,
that does not cross the line $y = 0$ and has only up- and down-steps. Let
$F(n, h)$ denote the number of paths of length $n$ starting at $(0, 0)$
and ending at $(n, h)$, having up- and down-steps and that stay within the
first quadrant. Then [22]

Lemma 3.

$$F(n, h) = \binom{n}{\frac{n-h}{2}} - \binom{n}{\frac{n-h-2}{2}}. \qquad (3.5)$$

Proof: Clearly, there are $\binom{n}{\frac{n-h}{2}}$ paths having up- and
down-steps, that start at $(0, 0)$ and end at $(n, h)$. We call such a
path good, if it never touches the line $y = -1$ and bad, otherwise.
Reflecting the segment of a bad path which starts at $(0, 0)$ and ends at
the first intersection point at the line $y = -1$, we observe that the set
of bad paths is in one-to-one correspondence with the set of all paths
having only up- and down-steps from $(0, -2)$ to $(n, h)$. Subtracting the
number of these paths, $\binom{n}{\frac{n-h-2}{2}}$, from the number of
all paths, the lemma follows.
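The reflection argument can be checked numerically for small n. The sketch below compares the closed form of Lemma 3 with a direct count of up/down paths that never go below the line y = 0; the names `F` and `F_brute` are hypothetical.

```python
from math import comb

def F(n, h):
    """Closed form: all paths from (0,0) to (n,h) minus the reflected bad paths."""
    if h < 0 or h > n or (n - h) % 2:
        return 0
    d = (n - h) // 2                        # number of down-steps
    bad = comb(n, d - 1) if d >= 1 else 0   # paths from (0,-2) to (n,h)
    return comb(n, d) - bad

def F_brute(n, h):
    """Direct count of up/down paths that stay at height >= 0."""
    counts = {0: 1}                         # height -> number of paths
    for _ in range(n):
        nxt = {}
        for y, c in counts.items():
            for t in (y + 1, y - 1):
                if t >= 0:
                    nxt[t] = nxt.get(t, 0) + c
        counts = nxt
    return counts.get(h, 0)

print(F(6, 0), F_brute(6, 0))
```

For h = 0 and even n the closed form reduces to the Catalan numbers, as expected for Dyck paths.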

We can now give explicit formulas for $O_3^0(\lambda^i, n-i)$ and
$O_3^*(\lambda^i, n-i)$ j^, 0]

Corollary 2. Let $\lambda^i_{h_1,h_2}$ denote the shape with at most two
rows, where $x_1^i(n) + x_2^i(n) = h_1$ and $x_1^i(n) - x_2^i(n) = h_2$.
Then we have

$$O_3^0(\lambda^i_{h_1,h_2}, n-i) = F((n-i)+2, h_1+2)\, F(n-i, h_2) - F((n-i)+2, h_2)\, F(n-i, h_1+2), \qquad (3.6)$$

$$O_3^*(\lambda^i_{h_1,h_2}, n-i) = \begin{cases} \sum_{l \ge 0} \binom{n-i}{2l}\, t(n-i-2l, h_1, h_2), & \text{for } (n-i) \text{ even} \\ \sum_{l \ge 0} \binom{n-i}{2l+1}\, t(n-i-2l-1, h_1, h_2), & \text{for } (n-i) \text{ odd.} \end{cases} \qquad (3.7)$$

4. Random k-noncrossing partial matchings 

In this section we generate k-noncrossing partial matchings with uniform
probability. The construction is as follows: first we compute for any
shape $\lambda$, having at most $(k-1)$ rows, the recursion relation of
Corollary 1. Second we compute the array
$(O_k^*(\lambda^i, n-i))_{\lambda^i,(n-i)}$, indexed by $\lambda^i$ and
$(n-i)$. Then we specify a Markov-process that constructs a k-noncrossing
partial matching with uniform probability with linear time and space
complexity.

Theorem 2. Random k-noncrossing partial matchings can be generated with
uniform probability in polynomial time. The algorithmic implementation,
see Algorithm 1, has $O(n^{k+1})$ preprocessing time and $O(n^k)$ space
complexity. Each k-noncrossing partial matching is generated with $O(n)$
time and space complexity.



Algorithm 1.

    Pascal ← Binomial(n)       (computation of all binomial coefficients, B(n,h))
    PShape ← ArrayP(n,k)       (computation of $O_k^*(\lambda^i, n-i)$, i = 0, 1, ..., n-1,
                                for all $\lambda^i$, stored in the k x n array PShape)
    while i < n do
        for j from 0 to k-1 do
            X[j] ← $O_k^*(\lambda_j^{i+1}, n-(i+1))$
            sum ← sum + X[j]
        end for
        Shape ← Random(sum)    (Random generates the random shape $\lambda^{i+1}$)
        i ← i + 1
        Insert Shape into Tableaux (the sequence of shapes)
    end while
    Map(Tableaux)              (maps Tableaux into its corresponding partial matching)

Figure 6 illustrates that Algorithm 1 indeed generates each k-noncrossing
partial matching with uniform probability.

Figure 6: Uniformity: for n = 12 we have $m = f_3^*(12) = 99991$ distinct
3-noncrossing partial matchings. We generate via Algorithm 1 $N$ = 10^ of
them randomly and display the frequency distribution of their
multiplicities (black dots) and
$\binom{N}{\ell} (1/m)^{\ell} (1 - 1/m)^{N-\ell}$ (red curve).

Proof: Suppose $(\lambda^i)_{i=0}^{n}$ is a *-tableaux of shape $\lambda$
having at most $(k-1)$ rows. By definition, a shape $\lambda^{i+1}$ does
only depend on its predecessor, $\lambda^i$. Accordingly, we can interpret
any given *-tableaux of shape $\lambda$ as a path of a Markov-process
$(X^i)_{i=0}^{n}$ over shapes, given as follows:

• $X^0 = X^n = \varnothing$ and $X^i$ is a shape having at most $(k-1)$
rows

• for $0 \le i \le n-1$, $X^i$ and $X^{i+1}$ differ by at most one square

• the transition probabilities are given by

$$P(X^{i+1} = \lambda^{i+1} \mid X^i = \lambda^i) = \frac{O_k^*(\lambda^{i+1}, n-i-1)}{O_k^*(\lambda^i, n-i)}. \qquad (4.1)$$

We next observe

$$\prod_{i=0}^{n-1} P(X^{i+1} = \lambda^{i+1} \mid X^i = \lambda^i) = \frac{O_k^*(\lambda^n, 0)}{O_k^*(\lambda^0, n)} = \frac{1}{O_k^*(\varnothing, n)}, \qquad (4.2)$$

where

$$O_k^*(\lambda^i, n-i) = \begin{cases} \sum_{l \ge 0} \binom{n-i}{2l}\, O_k^0(\lambda^i, n-i-2l), & \text{for } (n-i) \text{ even} \\ \sum_{l \ge 0} \binom{n-i}{2l+1}\, O_k^0(\lambda^i, n-i-2l-1), & \text{for } (n-i) \text{ odd.} \end{cases} \qquad (4.3)$$

Accordingly, the Markov-process, $(X^i)_{i=0}^{n}$, generates
k-noncrossing partial matchings with uniform probability. Clearly, the
Pascal triangle of binomial coefficients can be generated in $O(n^2)$ time
and space and for any fixed $\lambda^i$, having at most $(k-1)$ rows, we
can via Corollary 1 compute $O_k^0(\lambda^i, n-i)$ in $O(n)$ time.
Consequently, we can generate the array of numbers
$O_k^0(\lambda^i, n-i)$ as well as $O_k^*(\lambda^i, n-i)$ for all shapes
$\lambda^i$ in $O(n^2) + O(n)\,O(n)\,O(n^{k-1})$ time and $O(n^k)$ space.
The first factor $O(n)$ represents the time complexity for deriving the
recursion and the second comes from the computation of all numbers
$O_k^*(\lambda^i, n-i)$ for fixed $\lambda = \lambda^i$ for all $(n-i)$.

As for the generation of a random k-noncrossing partial matching, for each
shape $\lambda^i$, the transition probabilities can then be derived in
$O(1)$ time. Therefore, a k-noncrossing partial matching can be computed
with $O(n)$ time and space complexity, whence the theorem. □
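The Markov-process of the proof can be prototyped directly. The following sketch is an illustration under stated assumptions, not the authors' implementation: it obtains the counts by memoized dynamic programming instead of the P-recursion, uses exact integer weights for the transitions, and recovers the arcs with textbook reverse row insertion (entries increasing along rows). All function names are hypothetical.

```python
import random
from bisect import bisect_left
from functools import lru_cache

def neighbours(shape, k):
    """Shapes with at most k-1 rows differing from `shape` by at most one square."""
    rows = list(shape) + [0] * (k - 1 - len(shape))
    out = [shape]
    for r in range(k - 1):
        for d in (1, -1):
            new = rows.copy()
            new[r] += d
            if new[r] >= 0 and all(a >= b for a, b in zip(new, new[1:])):
                out.append(tuple(x for x in new if x > 0))
    return out

def sample_shapes(n, k, rng=random):
    """Random *-tableaux of empty shape, drawn with the weights of eq. (4.1)."""
    @lru_cache(maxsize=None)
    def count(shape, left):          # number of completions to the empty shape
        if sum(shape) > left:
            return 0
        if left == 0:
            return 1
        return sum(count(t, left - 1) for t in neighbours(shape, k))

    shapes = [()]
    for i in range(n):
        cand = neighbours(shapes[-1], k)
        weights = [count(t, n - i - 1) for t in cand]
        r = rng.randrange(sum(weights))      # exact integer sampling
        for t, w in zip(cand, weights):
            if r < w:
                break
            r -= w
        shapes.append(t)
    return shapes

def shapes_to_matching(shapes, k):
    """Read the shapes left to right and extract the arcs."""
    rows, arcs = [[] for _ in range(k - 1)], []
    for i in range(1, len(shapes)):
        old = list(shapes[i - 1]) + [0] * (k - 1 - len(shapes[i - 1]))
        new = list(shapes[i]) + [0] * (k - 1 - len(shapes[i]))
        if old == new:
            continue                          # 0-step: i is isolated
        r = next(t for t in range(k - 1) if old[t] != new[t])
        if new[r] > old[r]:
            rows[r].append(i)                 # +square: put i into the new square
        else:
            x = rows[r].pop()                 # -square: reverse row insertion
            for upper in reversed(rows[:r]):
                pos = bisect_left(upper, x) - 1
                upper[pos], x = x, upper[pos]
            arcs.append((x, i))
    return arcs

rng = random.Random(0)
print(shapes_to_matching(sample_shapes(12, 3, rng), 3))
```

Applied to the shape sequence obtained from the arcs (2,5), (1,6), (4,9), (3,10), (8,11) on [11], `shapes_to_matching` returns exactly these arcs again.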

5. Random k-noncrossing RNA structures 

In this section we generate k-noncrossing structures with uniform
probability. The approach is analogous to that of the previous section,
however the stochastic process required here is not a Markov-process any
more. In order to avoid generating arcs of the form $(i, i+1)$ (1-arcs),
some kind of one-step memory is needed.

To formalize this intuition, we shall begin by giving the formula for the
number of k-noncrossing structures, or equivalently partial matchings
without 1-arcs [15]

$$S_k(n) = \sum_{b \ge 0} (-1)^b \binom{n-b}{b}\, f_k^*(n-2b). \qquad (5.1)$$

We observe that a 1-arc corresponds via Theorem 1 to a subsequence of
shapes $(\lambda^i, \lambda^{i+1}, \lambda^{i+2} = \lambda^i)$, obtained
by first adding and then removing a square in the first row. This sequence
corresponds to a pair of steps $(+\square_1, -\square_1)$, where
$+\square_1$ and $-\square_1$ indicate that a square is added and
subtracted in the first row, respectively.

Let $\mathcal{Q}_k^*(\lambda^i, n-i, j)$ denote the set of *-tableaux of
shape $\lambda^i$ of length $(n-i)$ having at most $(k-1)$ rows containing
exactly $j$ pairs $(+\square_1, -\square_1)$ and set
$Q_k^*(\lambda^i, n-i, j) = |\mathcal{Q}_k^*(\lambda^i, n-i, j)|$.
Furthermore, let $W_k^*(\lambda^i, n-i)$ denote the number of *-tableaux
of shape $\lambda^i$ with at most $(k-1)$ rows of length $(n-i)$ that do
not contain any such pair of steps.

In terms of *-tableaux having at most $(k-1)$ rows, eq. (5.1) can be
rewritten as follows:
$W_k^*(\varnothing, n) = \sum_{b \ge 0} (-1)^b \binom{n-b}{b}\, O_k^*(\varnothing, n-2b)$.
We proceed by generalizing this relation from the empty shape to arbitrary
shapes.

Lemma 4. Let $\lambda^i$ be an arbitrary shape with at most $(k-1)$ rows,
then

$$W_k^*(\lambda^i, n-i) = \sum_{b \ge 0} (-1)^b \binom{(n-i)-b}{b}\, O_k^*(\lambda^i, n-i-2b). \qquad (5.2)$$

Proof: Let $(\lambda^h)_{h=0}^{n-i-2b}$ be a *-tableaux of shape
$\lambda^i$. We select from the set $\{0, \ldots, (n-2b)-i-1\}$ an
increasing sequence of labels $(r_1, \ldots, r_b)$. For each $r_s$ we
insert a pair $(+\square_1, -\square_1)$ after the corresponding shape
$\lambda^{r_s}$, see Fig. 7. This insertion generates a *-tableaux of
length $(n-i)$ of shape $\lambda^i$.

Considering the above insertion for all sequences $(r_1, \ldots, r_b)$, we
arrive at a family $\mathcal{F}_b$ of *-tableaux of length $(n-i)$
containing at least $b$ pairs $(+\square_1, -\square_1)$. Since we can
insert at any position $0 \le h \le ((n-i)-2b-1)$, $\mathcal{F}_b$ has
cardinality $\binom{(n-i)-b}{b}\, O_k^*(\lambda^i, n-i-2b)$. By
construction, each *-tableaux
$(\lambda^h)_{h=0}^{n-i} \in \mathcal{F}_b$ that exhibits exactly $j$
pairs $(+\square_1, -\square_1)$ appears with multiplicity
$\binom{j}{b}$, whence

$$\sum_{j \ge b} \binom{j}{b}\, Q_k^*(\lambda^i, n-i, j) = \binom{(n-i)-b}{b}\, O_k^*(\lambda^i, n-i-2b). \qquad (5.3)$$
Figure 7: Illustration of the proof idea: pairs
$(+\square_1, -\square_1)$ are inserted at positions 3, 5 and 8,
respectively.



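We consider $F_k(x) = \sum_{j \ge 0} Q_k^*(\lambda^i, n-i, j)\, x^j$.
Taking the $b$th derivative and setting $x = 1$ we obtain
$\frac{1}{b!} F_k^{(b)}(1) = \sum_{j \ge b} \binom{j}{b}\, Q_k^*(\lambda^i, n-i, j)\, 1^{j-b}$
and computing the Taylor expansion of $F_k(x)$ at $x = 1$

$$F_k(x) = \sum_{b \ge 0} \frac{1}{b!} F_k^{(b)}(1)\, (x-1)^b = \sum_{b \ge 0} \binom{(n-i)-b}{b}\, O_k^*(\lambda^i, n-i-2b)\, (x-1)^b.$$

Since $W_k^*(\lambda^i, n-i) = Q_k^*(\lambda^i, n-i, 0)$ is the constant
term of $F_k(x)$, the lemma follows.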
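Lemma 4 lends itself to a numerical sanity check. The sketch below, restricted to k = 3 for brevity, compares the inclusion-exclusion formula (5.2) with a direct count that carries a one-step memory of whether the previous step was $+\square_1$. The names `moves`, `O_star`, `W_formula` and `W_brute` are hypothetical, and the counts are obtained by brute-force dynamic programming rather than by the paper's recursions.

```python
from functools import lru_cache
from math import comb

K = 3   # shapes have at most K - 1 rows in this sketch

def moves(shape):
    """Yield (label, new_shape); label is 0 or (row, +1/-1), rows numbered from 1."""
    yield 0, shape
    rows = list(shape) + [0] * (K - 1 - len(shape))
    for r in range(K - 1):
        for d in (1, -1):
            new = rows.copy()
            new[r] += d
            if new[r] >= 0 and all(a >= b for a, b in zip(new, new[1:])):
                yield (r + 1, d), tuple(x for x in new if x > 0)

@lru_cache(maxsize=None)
def O_star(shape, m):
    """Number of *-tableaux of the given shape and length m."""
    if sum(shape) > m:
        return 0
    if m == 0:
        return 1
    return sum(O_star(t, m - 1) for _, t in moves(shape))

def W_formula(shape, m):
    """Right-hand side of eq. (5.2)."""
    return sum((-1) ** b * comb(m - b, b) * O_star(shape, m - 2 * b)
               for b in range(m // 2 + 1))

@lru_cache(maxsize=None)
def W_brute(shape, m, after_plus1=False):
    """Direct count of *-tableaux without any pair (+sq_1, -sq_1)."""
    if sum(shape) > m:
        return 0
    if m == 0:
        return 1
    return sum(W_brute(t, m - 1, lab == (1, 1))
               for lab, t in moves(shape)
               if not (after_plus1 and lab == (1, -1)))

print(W_formula((), 12), W_brute((), 12))
```

For the empty shape and length 12 both functions return 38635, the number of 3-noncrossing RNA structures quoted in Figure 8.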

We can now prove 

Theorem 3. A random k-noncrossing structure can be generated, after
polynomial preprocessing time, with uniform probability in linear time.
The algorithmic implementation, see Algorithm 2, has $O(n^{k+1})$
preprocessing time and $O(n^k)$ space complexity. Each k-noncrossing
structure is generated with $O(n)$ space and time complexity.

Algorithm 2.

    Pascal ← Binomial(n)       (computation of all binomial coefficients, B(n,h))
    PShape ← ArrayP(n,k)       (computation of $O_k^*(\lambda^i, n-i)$, i = 0, 1, ..., n-1,
                                for all $\lambda^i$)
    SShape ← ArrayS(n,k)       (computation of $W_k^*(\lambda_j^i, n-i)$,
                                j = 0, 1+, 1-, ..., (k-1)+, (k-1)-; i = 0, 1, ..., n-1,
                                stored in the k x n array SShape)
    while i < n do
        flag ← 1
        X[0] ← $W_k^*(\lambda_0^{i+1}, n-(i+1))$
        X[1] ← $W_k^*(\lambda_{1^+}^{i+1}, n-(i+1)) - W_k^*(\lambda_{1^-}^{i+2}, n-(i+2))$
        if flag = 0 and j = 2 then
            X[2] ← 0
        else
            X[2] ← $W_k^*(\lambda_{1^-}^{i+1}, n-(i+1))$
        end if
        sum ← X[0] + X[1] + X[2]
        for j from 2 to k-1 do
            X[2j-1] ← $W_k^*(\lambda_{j^+}^{i+1}, n-(i+1))$
            X[2j] ← $W_k^*(\lambda_{j^-}^{i+1}, n-(i+1))$
            sum ← sum + X[2j-1] + X[2j]
        end for
        Shape ← Random(sum)    (Random generates the random shape $\lambda_j^{i+1}$
                                with probability X[j]/sum)
        i ← i + 1
        if Shape = $\lambda_{1^+}^{i}$ then
            flag ← 0
        end if
        Insert $\lambda^{i+1}$ into Tableaux
    end while
    Map(Tableaux)

Figure 8 illustrates that Algorithm 2 generates k-noncrossing RNA
structures with uniform probability.

Proof: The idea is to interpret *-tableaux without pairs of steps
$(+\square_1, -\square_1)$ (good *-tableaux) as paths of a stochastic
process.

For this purpose we index the shapes $\lambda^{i+1}$ according to their
predecessors: let $i = 0, 1, \ldots, n-1$ and
$j \in \{0, 1^+, 1^-, \ldots, (k-1)^+, (k-1)^-\}$. Setting
$\lambda^0 = \varnothing$, we write $\lambda_j^{i+1}$, if $\lambda^{i+1}$
is obtained via

• doing nothing ($\lambda_0^{i+1}$)

• adding a square in the $j$th row ($\lambda_{j^+}^{i+1}$)

• deleting a square in the $j$th row ($\lambda_{j^-}^{i+1}$).

With this notation, the number of good *-tableaux of shape
$\lambda_{1^+}^{i+1}$ of length $(n-(i+1))$ is given as follows:

$$V_k^*(\lambda_{1^+}^{i+1}, n-(i+1)) = W_k^*(\lambda_{1^+}^{i+1}, n-(i+1)) - W_k^*(\lambda_{1^-}^{i+2}, n-(i+2)).$$

Figure 8: Uniformity: for n = 12 we have $m = S_3(n) = 38635$ distinct
3-noncrossing RNA structures. We generate via Algorithm 2 $N$ = 3 x 10^ of
them and display the frequency distribution of their multiplicities (blue
dots) and $\binom{N}{\ell} (1/m)^{\ell} (1 - 1/m)^{N-\ell}$ (red curve).

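In order to derive transition probabilities, we establish two equations:
first, for any $\lambda_j^i$, where $j \ne 1^+$

$$W_k^*(\lambda_j^i, n-i) = V_k^*(\lambda_{1^+}^{i+1}, n-(i+1)) + W_k^*(\lambda_{1^-}^{i+1}, n-(i+1)) + \sum_{h=2}^{k-1} \left( W_k^*(\lambda_{h^+}^{i+1}, n-(i+1)) + W_k^*(\lambda_{h^-}^{i+1}, n-(i+1)) \right) + W_k^*(\lambda_0^{i+1}, n-(i+1))$$

and second, in case of $j = 1^+$

$$V_k^*(\lambda_{1^+}^i, n-i) = V_k^*(\lambda_{1^+}^{i+1}, n-(i+1)) + \sum_{h=2}^{k-1} \left( W_k^*(\lambda_{h^+}^{i+1}, n-(i+1)) + W_k^*(\lambda_{h^-}^{i+1}, n-(i+1)) \right) + W_k^*(\lambda_0^{i+1}, n-(i+1)).$$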

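We are now in position to specify the process $(X^i)_{i=0}^{n}$:

• $X^0 = X^n = \varnothing$ and $X^i$ is a shape having at most $(k-1)$
rows

• for $0 \le i \le n-1$, $X^i$ and $X^{i+1}$ differ by at most one square

• there exists no subsequence $X^i, X^{i+1}, X^{i+2} = X^i$ obtained by
first adding and second removing a square in the first row

• for $j \ne 1^+$

$$P(X^{i+1} = \lambda_l^{i+1} \mid X^i = \lambda_j^i) = \begin{cases} \dfrac{W_k^*(\lambda_l^{i+1}, n-(i+1))}{W_k^*(\lambda_j^i, n-i)}, & \text{for } l \ne 1^+ \\[2ex] \dfrac{V_k^*(\lambda_l^{i+1}, n-(i+1))}{W_k^*(\lambda_j^i, n-i)}, & \text{for } l = 1^+ \end{cases} \qquad (5.4)$$

• for $j = 1^+$

$$P(X^{i+1} = \lambda_l^{i+1} \mid X^i = \lambda_j^i) = \begin{cases} \dfrac{W_k^*(\lambda_l^{i+1}, n-(i+1))}{V_k^*(\lambda_j^i, n-i)}, & \text{for } l \ne 1^+, 1^- \\[2ex] \dfrac{V_k^*(\lambda_l^{i+1}, n-(i+1))}{V_k^*(\lambda_j^i, n-i)}, & \text{for } l = 1^+. \end{cases} \qquad (5.5)$$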

As in the proof of Theorem 2 we observe that eq. (5.4) and eq. (5.5) imply

$$\prod_{i=0}^{n-1} P(X^{i+1} = \lambda^{i+1} \mid X^i = \lambda^i) = \frac{W_k^*(\lambda^n = \varnothing, 0)}{W_k^*(\lambda^0 = \varnothing, n)} = \frac{1}{W_k^*(\varnothing, n)}. \qquad (5.6)$$

Consequently, the process $(X^i)_{i=0}^{n}$ generates random k-noncrossing
structures with uniform probability in $O(n)$ time and space. According to
Corollary 1 we can for any $\lambda^i$, having at most $(k-1)$ rows,
compute $O_k^*(\lambda^i, n-i)$ in $O(n)$ time. Consequently, we can
generate the arrays $(O_k^*(\lambda^i, n-i))_{\lambda^i, n-i}$ and
$(W_k^*(\lambda^i, n-i))_{\lambda^i, n-i}$ in
$O(n^2) + O(n^2)\,O(n^{k-1})$ time and $O(n^k)$ space.

A random k-noncrossing structure is then generated as a *-tableaux with at
most $(k-1)$ rows using the array
$(W_k^*(\lambda^i, n-i))_{\lambda^i, n-i}$ with $O(n)$ time and space
complexity. □
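The one-step memory can be made concrete in a short prototype. The sketch below is an illustration under stated assumptions rather than the authors' Algorithm 2: the number of good completions is computed by memoized dynamic programming with a flag recording whether the previous step was $+\square_1$ (this plays the role of $W_k^*$ for flag off and $V_k^*$ for flag on), and arcs are extracted by textbook reverse row insertion. The function name is hypothetical.

```python
import random
from bisect import bisect_left
from functools import lru_cache

def sample_structure(n, k, rng=random):
    """Sample a k-noncrossing partial matching on [n] without 1-arcs."""
    def moves(shape):
        yield 0, shape
        rows = list(shape) + [0] * (k - 1 - len(shape))
        for r in range(k - 1):
            for d in (1, -1):
                new = rows.copy()
                new[r] += d
                if new[r] >= 0 and all(a >= b for a, b in zip(new, new[1:])):
                    yield (r, d), tuple(x for x in new if x > 0)

    @lru_cache(maxsize=None)
    def good(shape, left, after_plus1):
        # completions to the empty shape that avoid the pair (+sq_1, -sq_1)
        if sum(shape) > left:
            return 0
        if left == 0:
            return 1
        return sum(good(t, left - 1, lab == (0, 1))
                   for lab, t in moves(shape)
                   if not (after_plus1 and lab == (0, -1)))

    rows, arcs = [[] for _ in range(k - 1)], []
    shape, flag = (), False
    for i in range(1, n + 1):
        cand = [(lab, t) for lab, t in moves(shape)
                if not (flag and lab == (0, -1))]
        weights = [good(t, n - i, lab == (0, 1)) for lab, t in cand]
        r = rng.randrange(sum(weights))          # exact integer sampling
        for (lab, t), w in zip(cand, weights):
            if r < w:
                break
            r -= w
        shape, flag = t, lab == (0, 1)           # remember a +sq_1 step
        if lab == 0:
            continue                             # 0-step: i is isolated
        row, d = lab
        if d == 1:
            rows[row].append(i)                  # i goes into the new square
        else:
            x = rows[row].pop()                  # reverse row insertion
            for upper in reversed(rows[:row]):
                pos = bisect_left(upper, x) - 1
                upper[pos], x = x, upper[pos]
            arcs.append((x, i))
    return arcs

rng = random.Random(0)
print(sample_structure(12, 3, rng))
```

Rows are numbered from 0 inside the sketch, so the label `(0, 1)` stands for $+\square_1$ and `(0, -1)` for $-\square_1$.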